

10 media moments and controversies that defined 2025

FOX News

Sunday Morning Futures anchor Maria Bartiromo looks back at her 2025 interviews with President Donald Trump as he laid out his agenda on the border, the economy, energy and foreign policy heading into 2026.



Tree Search for LLM Agent Reinforcement Learning

Ji, Yuxiang, Ma, Ziyu, Wang, Yong, Chen, Guanhua, Chu, Xiangxiang, Wu, Liaoni

arXiv.org Artificial Intelligence

Recent advances in reinforcement learning (RL) have significantly enhanced the agentic capabilities of large language models (LLMs). In long-horizon, multi-turn agent tasks, existing approaches driven solely by outcome rewards often suffer from sparse supervision. To address this challenge, we propose Tree-based Group Relative Policy Optimization (Tree-GRPO), a grouped agent RL method based on tree search, where each tree node represents a complete agent interaction step. By sharing common prefixes, tree-search sampling increases the number of rollouts achievable within a fixed budget of tokens or tool calls. Moreover, we find that the tree-structured trajectory naturally allows the construction of step-wise process supervision signals using only the outcome reward. Based on this, Tree-GRPO estimates grouped relative advantages at both the intra-tree and inter-tree levels. Through theoretical analysis, we demonstrate that the objective of intra-tree group relative policy optimization is equivalent to that of step-level direct preference learning. Experiments across 11 datasets and 3 types of QA tasks demonstrate the superiority of the proposed tree-based RL over chain-based RL methods.

Figure 1: Comparison of chain-based and tree-based sampling strategies in LLM multi-turn agent RL. The tree structure brings two major advantages: (i) a smaller rollout budget (both tokens and tool calls); (ii) higher performance. Right (Ours): tree search with nodes corresponding to complete agent steps.

Reinforcement Learning (RL) has emerged as a pivotal post-training paradigm for Large Language Models (LLMs), catalyzing the development of several frontier models (DeepSeek-AI Team, 2025; Yang et al., 2025a; OpenAI, 2024). RL-tuned LLMs trained only with outcome rewards acquire complex reasoning abilities and achieve remarkable gains in single-turn tasks, such as mathematical proof and code generation (Team et al., 2025b; Yu et al., 2025; Chu et al., 2025a; Shao et al., 2024; Xin et al., 2024). This suggests that LLMs can learn not only through static imitation, but also by actively interacting with dynamic environments. Guided by this prospect, recent works have extended this RL paradigm to more complex agent settings involving dynamic, multi-turn interactions (Feng et al., 2025b; Singh et al., 2025; Wang et al., 2025b; Qian et al., 2025; Feng et al.). (Footnote: work done during internship at AMAP, Alibaba Group.)
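To make the two-level advantage estimation concrete, below is a minimal Python sketch of how grouped relative advantages could be derived from outcome rewards on the leaves of a rollout tree. The `TreeNode` class, the use of the mean leaf reward as a subtree score, and the normalization details are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of Tree-GRPO-style advantage estimation (illustrative,
# not the authors' code). Leaves carry outcome rewards; internal nodes
# are shared prefixes of the agent's multi-turn rollouts.
from dataclasses import dataclass, field
from statistics import mean, pstdev

@dataclass
class TreeNode:
    reward: float = 0.0                       # outcome reward (meaningful on leaves)
    children: list["TreeNode"] = field(default_factory=list)

def leaf_rewards(node: TreeNode) -> list[float]:
    """Collect the outcome rewards of all leaves under `node`."""
    if not node.children:
        return [node.reward]
    return [r for c in node.children for r in leaf_rewards(c)]

def intra_tree_advantages(node: TreeNode, out: list[float]) -> list[float]:
    """At each branching node, score each child subtree by the mean outcome
    reward of its leaves and normalize across siblings. This turns a single
    outcome reward into step-wise process signals along the tree."""
    if len(node.children) > 1:
        scores = [mean(leaf_rewards(c)) for c in node.children]
        mu, sigma = mean(scores), pstdev(scores) or 1.0
        out.extend((s - mu) / sigma for s in scores)
    for c in node.children:
        intra_tree_advantages(c, out)
    return out

def inter_tree_advantages(trees: list[TreeNode]) -> list[float]:
    """GRPO-style normalization of whole-tree returns across the group."""
    returns = [mean(leaf_rewards(t)) for t in trees]
    mu, sigma = mean(returns), pstdev(returns) or 1.0
    return [(r - mu) / sigma for r in returns]
```

Because siblings share a prefix, normalizing sibling subtree scores compares trajectories that differ only after the branching step, which is why the intra-tree objective relates to step-level preference learning.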


A$^2$Search: Ambiguity-Aware Question Answering with Reinforcement Learning

Zhang, Fengji, Niu, Xinyao, Ying, Chengyang, Lin, Guancheng, Hao, Zhongkai, Fan, Zhou, Huang, Chengen, Keung, Jacky, Chen, Bei, Lin, Junyang

arXiv.org Artificial Intelligence

Recent advances in Large Language Models (LLMs) and Reinforcement Learning (RL) have led to strong performance in open-domain question answering (QA). However, existing models still struggle with questions that admit multiple valid answers. Standard QA benchmarks, which typically assume a single gold answer, overlook this reality and thus produce inappropriate training signals. Existing attempts to handle ambiguity often rely on costly manual annotation, which is difficult to scale to multi-hop datasets such as HotpotQA and MuSiQue. In this paper, we present A$^2$Search, an annotation-free, end-to-end training framework to recognize and handle ambiguity. At its core is an automated pipeline that detects ambiguous questions and gathers alternative answers via trajectory sampling and evidence verification. The model is then optimized with RL using a carefully designed $\mathrm{AnsF1}$ reward, which naturally accommodates multiple answers. Experiments on eight open-domain QA benchmarks demonstrate that A$^2$Search achieves new state-of-the-art performance. With only a single rollout, A$^2$Search-7B yields an average $\mathrm{AnsF1}@1$ score of $48.4\%$ across four multi-hop benchmarks, outperforming all strong baselines, including the substantially larger ReSearch-32B ($46.2\%$). Extensive analyses further show that A$^2$Search resolves ambiguity and generalizes across benchmarks, highlighting that embracing ambiguity is essential for building more reliable QA systems. Our code, data, and model weights can be found at https://github.com/zfj1998/A2Search
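To illustrate how a reward can accommodate multiple gold answers, here is a minimal sketch of an answer-set F1 in the spirit of the paper's $\mathrm{AnsF1}$; the lowercase/strip normalization and exact set matching are assumptions, not necessarily the authors' implementation.

```python
# Hedged sketch of an AnsF1-style reward over answer sets (illustrative).
def _norm(s: str) -> str:
    return s.strip().lower()

def ans_f1(predicted: set[str], gold: set[str]) -> float:
    """F1 between a predicted answer set and a gold answer set, so
    questions admitting multiple valid answers are rewarded fairly."""
    pred = {_norm(a) for a in predicted}
    ref = {_norm(a) for a in gold}
    if not pred or not ref:
        return 0.0
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

# Example: a model that recovers one of two valid answers.
assert round(ans_f1({"Paris"}, {"Paris", "Lutetia"}), 3) == 0.667
```

Unlike exact match against a single gold answer, this reward gives partial credit for recovering some valid answers while penalizing spurious ones.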


Signs of dyslexia and reading troubles can be spotted in kindergarten -- or even preschool

Los Angeles Times

Vanessa Silver, who tutors young children with dyslexia, works with Liina Yerro, 9, in Granada Hills. California will begin universal screening of kindergarten through second-grade students for reading difficulties, including dyslexia.


Unsupervised Entity Alignment Based on Personalized Discriminative Rooted Tree

Yang, Yaming, Wang, Zhe, Guan, Ziyu, Zhao, Wei, Huang, Xinyan, He, Xiaofei

arXiv.org Artificial Intelligence

Entity Alignment (EA) aims to link potentially equivalent entities across different knowledge graphs (KGs). Most existing EA methods are supervised, as they require the supervision of seed alignments, i.e., manually specified aligned entity pairs. Very recently, several EA studies have attempted to get rid of seed alignments. Despite preliminary progress, they still suffer from two limitations: (1) the entity embeddings produced by their GNN-like encoders lack personalization, since some of the aggregation subpaths are shared between different entities; (2) they cannot fully alleviate the distribution-distortion issue between candidate KGs due to the absence of a supervised signal. In this work, we propose a novel unsupervised entity alignment approach called UNEA to address these two issues. First, we parametrically sample a tree neighborhood rooted at each entity and accordingly develop a tree attention aggregation mechanism to extract a personalized embedding for each entity. Second, we introduce an auxiliary task of maximizing the mutual information between the input and the output of the KG encoder, to regularize the model and prevent distribution distortion. Extensive experiments show that UNEA achieves a new state of the art for the unsupervised EA task, and can even outperform many existing supervised EA baselines.
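The tree attention aggregation can be pictured with a short sketch; the scaled dot-product scoring and the tensor shapes below are assumptions for illustration, since the abstract does not specify the exact attention form.

```python
# Hedged sketch of attention aggregation over a sampled rooted tree
# neighborhood, loosely following UNEA's description (illustrative).
import torch
import torch.nn.functional as F

def tree_attention_aggregate(root_emb: torch.Tensor,
                             tree_embs: torch.Tensor,
                             W_q: torch.Tensor,
                             W_k: torch.Tensor,
                             W_v: torch.Tensor) -> torch.Tensor:
    """Personalized embedding for the root entity from its sampled tree.

    root_emb:  (d,)   embedding of the root entity
    tree_embs: (n, d) embeddings of nodes in the rooted tree neighborhood
    W_q/W_k/W_v: (d, d) learned projections
    """
    q = root_emb @ W_q                        # (d,)   query from the root
    k = tree_embs @ W_k                       # (n, d) keys from tree nodes
    v = tree_embs @ W_v                       # (n, d) values from tree nodes
    scores = k @ q / (q.shape[-1] ** 0.5)     # (n,)   scaled dot-product
    alpha = F.softmax(scores, dim=0)          # attention over tree nodes
    return alpha @ v                          # (d,)   personalized embedding
```

Because each entity attends over its own sampled tree rather than a shared aggregation path, the resulting embedding is personalized to that entity.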


Don't Buy a Tesla. Sell Your Tesla. Refuse a Tesla at the Rental Counter. Yes--It Will Help.

Slate

There are myriad reasons to loathe Elon Musk, the CEO of Tesla, who has become a top ally of Donald Trump. OG haters have long accused Musk of endangering road users by exaggerating the capabilities of Tesla's navigation assistance systems, misleadingly named Autopilot and Full Self-Driving. The ranks of the angry have steadily grown, fueled by Musk's habit of amplifying trans-bashing and antisemitism as well as his demolition of Twitter. Now, as Musk cozies up to extremists across Europe, wields the Department of Government Efficiency as a wrecking ball against the federal government, and generally acts as an unelected leader, the furor is reaching a fever pitch.


CondAmbigQA: A Benchmark and Dataset for Conditional Ambiguous Question Answering

Li, Zongxi, Li, Yang, Xie, Haoran, Qin, S. Joe

arXiv.org Artificial Intelligence

Large language models (LLMs) are prone to hallucinations in question-answering (QA) tasks when faced with ambiguous questions. Users often assume that LLMs share their cognitive alignment, a mutual understanding of context, intent, and implicit details, leading them to omit critical information in their queries. However, LLMs generate responses based on their own assumptions, which may be perceived as hallucinations when they misalign with the user's intent. Therefore, identifying those implicit assumptions is crucial for resolving ambiguities in QA. Prior work, such as AmbigQA, reduces ambiguity in queries via human-annotated clarifications, which is not feasible in real applications. Meanwhile, ASQA compiles AmbigQA's short answers into long-form responses but inherits human biases and fails to capture the explicit logical distinctions that differentiate the answers. We introduce Conditional Ambiguous Question-Answering (CondAmbigQA), a benchmark with 200 ambiguous queries and condition-aware evaluation metrics. Our study pioneers the concept of ``conditions'' in ambiguous QA tasks, where conditions stand for contextual constraints or assumptions that resolve ambiguities. The retrieval-based annotation strategy uses retrieved Wikipedia fragments to identify possible interpretations of a given query as its conditions and annotates the answers under those conditions. Such a strategy minimizes the human bias introduced by different knowledge levels among annotators. By fixing retrieval results, CondAmbigQA evaluates how RAG systems leverage conditions to resolve ambiguities. Experiments show that models considering conditions before answering improve performance by $20\%$, with an additional $5\%$ gain when conditions are explicitly provided. These results underscore the value of conditional reasoning in QA, offering researchers tools to rigorously evaluate ambiguity resolution.
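A minimal sketch of the condition-then-answer flow that the benchmark evaluates might look as follows; the prompts and the `llm` callable are hypothetical placeholders, not CondAmbigQA's actual interface.

```python
# Hedged sketch of condition-aware QA over fixed retrieval results
# (illustrative; `llm` is any text-in/text-out callable).
def condition_aware_answer(llm, query: str, fragments: list[str]) -> list[dict]:
    context = "\n\n".join(fragments)  # fixed retrieved Wikipedia fragments
    conditions = llm(
        f"Context:\n{context}\n\nQuestion: {query}\n"
        "List each distinct interpretation (condition) under which this "
        "question has a different answer, one per line."
    ).splitlines()
    return [
        {
            "condition": cond,
            "answer": llm(
                f"Context:\n{context}\n\nQuestion: {query}\n"
                f"Assume this condition holds: {cond}\nAnswer concisely."
            ),
        }
        for cond in conditions
        if cond.strip()
    ]
```

Making the conditions explicit before answering mirrors the reported results: reasoning about conditions first helps, with a further gain when the gold conditions are supplied directly.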


Learning Curve: The new players in Congress

FOX News

Fox News senior congressional correspondent Chad Pergram joins 'Fox News Live' to explain how he prepares to report on Congress for the upcoming year. Every two years, the period between the November election and the start of the new Congress is often the busiest stretch for covering Congress. Reporters are trying to figure out which members won their elections and which lost. The existing Congress is back, attempting to prevent a government shutdown and often plowing through a landscape of other major legislation. There are often leadership elections.


US probing Elon Musk's Tesla over self-driving systems

BBC News

NHTSA's preliminary evaluation follows four crash reports involving the use of Tesla's "Full Self-Driving", or FSD, software. The agency said the crashes involved reduced roadway visibility, such as fog or glare from the sun. One of the incidents involved a Tesla fatally striking a pedestrian, and another involved someone being injured, NHTSA said. The evaluation aims to determine whether Tesla's self-driving systems can detect and appropriately respond to reduced-visibility conditions. It will also examine whether other self-driving crashes have happened under similar conditions.